Design and Implementation of a Predictive File Prefetching Algorithm
Authors
Abstract
We have previously shown that the patterns in which files are accessed offer information that can accurately predict upcoming file accesses. Most modern caches ignore these patterns, thereby failing to use information that enables significant reductions in I/O latency. While prefetching heuristics that expect sequential accesses are often effective at reducing I/O latency, they cannot be applied across files, because the abstraction of a file has no intrinsic concept of a successor. This limits the ability of modern file systems to prefetch. Here we present our implementation of a predictive prefetching system that makes use of file access patterns to reduce I/O latency. Previously we developed a technique called Partitioned Context Modeling (PCM) [13] that efficiently models file accesses to reliably predict upcoming requests. We present our experiences in implementing predictive prefetching based on file access patterns. From the lessons learned we developed a new technique, Extended Partitioned Context Modeling (EPCM), which has even better performance. We have modified the Linux kernel to prefetch file data based on Partitioned Context Modeling and Extended Partitioned Context Modeling. With this implementation we examine how a prefetching policy that uses such models to predict upcoming accesses can result in large reductions in I/O latency. We tested our implementation with four different application-based benchmarks and saw I/O latency reduced by 31% to 90% and elapsed time reduced by 11% to 16%.
[email protected]. Supported in part by the Usenix Association and the National Science Foundation under Grant CCR-9704347.
[email protected]. Supported in part by the National Science Foundation under Grant CCR-9704347.
Similar resources
Design and implementation of a model predictive controller for the COVID-19 spread restraint in Iran
In this paper, a model is proposed based on the different levels of social restrictions for the COVID-19 spread restraint in Iran. Also, a Genetic Algorithm (GA) identifies the parameters of the model using the main data reported by the Iranian Ministry of Health and simulated data based on the proposed model. Whereas Model Predictive Control (MPC) is a popular method which has been widely used in process ...
The Design and Implementation of Appointed File Prefetching for Distributed File Systems
Many types of distributed file systems have been in widespread use for more than a decade. One of the key issues in their design is how to reduce the latency when accessing remote files, with the solutions including cache replacement and file-prefetching technologies. In this paper, we propose a novel method called appointed file prefetching, in which the main idea is to enable the user or system a...
Design and Practical Implementation of a New Markov Model Predictive Controller for Variable Communication Packet Loss in Network Control Systems
The current paper investigates the influence of packet losses in network control systems (NCSs) using the model predictive control (MPC) strategy. The study focuses on the two main sources of network packet loss along the communication paths: sensor to controller and controller to actuator. A new Markov-based method is employed to recursively estimate the probability of time delay in controller to ac...
Path-Based Target Prediction for File System Prefetching
Prefetching is a well-known technique for mitigating the von Neumann bottleneck. In its most rudimentary form, prefetching simplifies to sequential lookahead. Unfortunately, large classes of applications exhibit file access patterns that are not amenable to sequential prefetching. More general purpose approaches often use models to develop an appropriate prefetching strategy. Such models tend t...
Implementation and Evaluation of Prefetching in the Intel Paragon Parallel File System
The significant difference between the speeds of the I/O system (e.g. disks) and compute processors in parallel systems creates a bottleneck that lowers the performance of an application that does a considerable amount of disk accesses. A major portion of the compute processors’ time is wasted on waiting for I/O to complete. This problem can be addressed to a certain extent, if the necessary ...